Our World In Data choropleth

A post on how to recreate the characteristic Our World In Data choropleth in R.

Author

Jonathan Jayes

Published

June 1, 2022

Purpose

I really look up to Max Roser and the team at Our World in Data. They publish some of the best short-form articles, condensing a wealth of academic literature to, in their words, “make progress against the world’s largest problems”.

The mission is summed up well in a lecture given at Stellenbosch University by Max Roser this year, included below.

In this tutorial I want to walk through recreating one of their classic chart types in R: the world map choropleth with an overlaid line graph for each country. A typical example is shown below.

Context

There is a lot of information about the OWID grapher tool. You can have a look at their GitHub repo and an older Reddit AMA if you are interested. It’s a custom system that allows for systematic changes across their website, drawing on data from a central database.

Components

What are the parts I want to recreate? The map has:

  • a base map, where the colour fill of each country indicates its position in a specific measure in a particular year.

  • a simple line chart that appears when you hover over a country, showing how the measure has changed within a country over time.

  • a clear legend

  • a note specifying the source of the data

I walk through creating each of these below.

The world map

The base map is sourced from the maps package. I add a three-letter country code from the English name of the country using the countrycode package and filter out Antarctica, Greenland and the French Southern and Antarctic Lands.
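A minimal sketch of this step is shown here. It assumes the polygons are pulled in with ggplot2's map_data(), which is only one way of reading the maps package data; the object names world_raw, region_codes and world_map are illustrative.

```r
library(dplyr)
library(ggplot2)      # map_data() turns maps-package data into a data frame
library(countrycode)

# One row per polygon vertex for every country (needs the maps package installed)
world_raw <- map_data("world")

# Look up ISO 3166-1 alpha-3 codes once per region rather than once per vertex.
# Regions that countrycode cannot match are returned as NA.
region_codes <- world_raw |>
  distinct(region) |>
  mutate(code = countrycode(region,
                            origin = "country.name",
                            destination = "iso3c",
                            warn = FALSE))

excluded <- c("Antarctica", "Greenland", "French Southern and Antarctic Lands")

world_map <- world_raw |>
  filter(!region %in% excluded) |>
  left_join(region_codes, by = "region")
```

Looking up the code on the distinct region names first keeps the countrycode call small, since the map data has many thousands of vertices but only a few hundred regions.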

The base map is projected with the Web Mercator projection (EPSG:3857, which is built on the WGS 84 datum), the same one Google Maps uses.
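To make the projection concrete, here is a small base R sketch of the spherical Web Mercator forward formulas. It is an illustration only and not necessarily how the map in this post is projected, since a plotting library would normally apply the projection for you. The function name web_mercator is hypothetical.

```r
# Spherical (Web) Mercator forward projection, EPSG:3857.
# Inputs are longitude and latitude in degrees on the WGS 84 datum;
# outputs are metres east and north of the point (0, 0).
web_mercator <- function(lon, lat, radius = 6378137) {
  lambda <- lon * pi / 180   # longitude in radians
  phi    <- lat * pi / 180   # latitude in radians
  data.frame(
    x = radius * lambda,
    y = radius * log(tan(pi / 4 + phi / 2))
  )
}

# Approximate coordinates of Cape Town, used purely as an example input
web_mercator(lon = 18.42, lat = -33.92)
```

The y formula grows without bound towards the poles, which is one reason polar regions are often dropped or cropped on maps that use this projection.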

Data

We read in the data as a CSV file, and tidy up the column names so that they are in snake case with the clean_names() command from the very helpful janitor package.
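As a small sketch of what clean_names() does, the example below reads a stand-in CSV from a string. The column layout and the 2019 value are assumptions for illustration, and in practice a file path would be passed to the reader instead of inline text.

```r
library(janitor)

# Stand-in for the downloaded file: the real column names may differ,
# and the 2019 value is a placeholder
csv_text <- "Entity,Code,Year,Share Of Population
South Africa,ZAF,2019,19.8
South Africa,ZAF,2020,20.3"

df <- read.csv(text = csv_text, check.names = FALSE) |>
  clean_names()   # lower-case the names and replace spaces with underscores

names(df)
```

After cleaning, every column can be referred to without backticks, which keeps the later dplyr code tidy.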

Next we remove the summary groups which we cannot show on the map, including the World Bank country income groupings.

# A tibble: 16 × 1
   entity                      
   <chr>                       
 1 East Asia and Pacific       
 2 Europe and Central Asia     
 3 European Union              
 4 High income                 
 5 Latin America and Caribbean 
 6 Low and middle income       
 7 Low income                  
 8 Lower middle income         
 9 Middle East and North Africa
10 Middle income               
11 North America               
12 South Asia                  
13 Sub-Saharan Africa          
14 Tuvalu                      
15 Upper middle income         
16 World                       
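The listing above includes Tuvalu alongside the aggregates, which suggests it may come from comparing the dataset against the map rather than from a hand-written list of groups. That is an assumption on my part. Under it, the step could be sketched with an anti-join and a semi-join, using toy stand-ins for the real objects:

```r
library(dplyr)

# Toy stand-ins: `df` is the cleaned dataset, `map_codes` holds the codes
# present on the base map. The values are placeholders.
df <- tibble(
  entity = c("South Africa", "High income", "World", "Tuvalu"),
  code   = c("ZAF", NA, "OWID_WRL", "TUV"),
  value  = c(1, 2, 3, 4)
)
map_codes <- tibble(code = c("ZAF", "SWE"))

# Entities with no matching polygon on the map
not_on_map <- df |>
  anti_join(map_codes, by = "code") |>
  distinct(entity)

# Keep only the rows that can actually be drawn
df_countries <- df |>
  semi_join(map_codes, by = "code")
```

Printing not_on_map before filtering is a useful check that only aggregates and very small territories are being dropped.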

Create a colour palette

We want to use the viridis palette through scale_color_viridis_c(). Because the line graphs reuse the colours of the choropleth, we map the palette to the minimum and maximum values in our dataset.

# A tibble: 1 × 2
    min   max
  <dbl> <dbl>
1   3.5  68.5
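One way to pin the palette to that range, sketched with a toy column called value (an assumed name), is to compute the range once and pass it as the limits of the scale:

```r
library(dplyr)
library(ggplot2)

# Toy values spanning the range shown above
df <- tibble(value = c(3.5, 20.3, 68.5))

value_range <- df |>
  summarise(min = min(value, na.rm = TRUE),
            max = max(value, na.rm = TRUE))

# One scale definition for both line colour and map fill, so that a given
# value always gets the same colour wherever it appears
owid_colour_scale <- scale_colour_viridis_c(
  limits = c(value_range$min, value_range$max),
  aesthetics = c("colour", "fill")
)
```

Fixing the limits matters for the per-country line charts: without them, each small chart would stretch the palette over its own range and the colours would no longer match the map.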

How to plot the line graph?

The line graph that appears when you hover over the OWID map is very simple. It has just the starting value on the y-axis, the first and last years on the x-axis, and a line that changes colour in accordance with the scale of the choropleth. The hover window which contains the graph also shows the country name and the value of the measure in the most recent year.

To recreate it, we need to store these four values and draw the coloured line.

A function for plotting the graph
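A sketch of such a function is given here. It assumes a data frame with columns year and value for a single country; the function name plot_country_line and the toy data are illustrative, not the original code.

```r
library(dplyr)
library(ggplot2)

plot_country_line <- function(country_data, limits) {
  first_row <- slice_min(country_data, year, n = 1, with_ties = FALSE)
  last_row  <- slice_max(country_data, year, n = 1, with_ties = FALSE)

  ggplot(country_data, aes(x = year, y = value, colour = value)) +
    geom_line() +
    # Same limits as the map, so the line colours match the choropleth
    scale_colour_viridis_c(limits = limits, guide = "none") +
    # Only the first and last years on the x-axis
    scale_x_continuous(breaks = c(first_row$year, last_row$year)) +
    # Only the starting value on the y-axis
    scale_y_continuous(breaks = first_row$value) +
    theme_minimal() +
    theme(panel.grid = element_blank(), axis.title = element_blank())
}

# Toy series with placeholder values
toy <- tibble(year = 2000:2020, value = seq(30, 20, length.out = 21))
p <- plot_country_line(toy, limits = c(3.5, 68.5))
```

Passing the limits in as an argument, rather than computing them inside the function, keeps every country's chart on the shared colour scale.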

Now making the table

South Africa
20.3%
in 2020
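The text part of a hover table like the one above could be built along the following lines. The sketch assumes the gt package is used, as the column name gt later in the post suggests, and it leaves out the embedded line chart for brevity. The function name make_tooltip_table is hypothetical.

```r
library(gt)

make_tooltip_table <- function(country, latest_value, latest_year) {
  data.frame(
    label = c(country,
              paste0(latest_value, "%"),
              paste("in", latest_year))
  ) |>
    gt() |>
    tab_options(column_labels.hidden = TRUE) |>  # no header row in a tooltip
    as_raw_html()                                # HTML string with inline CSS
}

tooltip_html <- make_tooltip_table("South Africa", 20.3, 2020)
```

Converting to raw HTML with inline styles means the table can be dropped into a tooltip without depending on any external stylesheet.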

Creating the plots for each country

Here we use purrr::map() to build the table in raw HTML for each country and save the results inside a tibble. The output shows a list-column of HTML objects called gt.

# A tibble: 162 × 2
   code  gt        
   <chr> <list>    
 1 AFG   <html [1]>
 2 ALB   <html [1]>
 3 DZA   <html [1]>
 4 AND   <html [1]>
 5 ARG   <html [1]>
 6 ARM   <html [1]>
 7 AUS   <html [1]>
 8 AUT   <html [1]>
 9 AZE   <html [1]>
10 BHS   <html [1]>
# … with 152 more rows
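A tibble of this shape can be produced by nesting the data by country code and mapping a table-building function over the nested data. The sketch below uses a simple stand-in function, make_tooltip (hypothetical), and toy data with placeholder values.

```r
library(dplyr)
library(purrr)
library(tidyr)

# Stand-in for the real table-building function: returns an HTML fragment
make_tooltip <- function(country_data) {
  latest <- slice_max(country_data, year, n = 1, with_ties = FALSE)
  htmltools::HTML(
    sprintf("<b>%s</b><br>%s%% in %s", latest$entity, latest$value, latest$year)
  )
}

# Toy data with placeholder values
df <- tibble(
  entity = c("South Africa", "South Africa", "Sweden"),
  code   = c("ZAF", "ZAF", "SWE"),
  year   = c(2019, 2020, 2020),
  value  = c(1, 2, 3)
)

gt_tables <- df |>
  nest(data = -code) |>                    # one row per country code
  mutate(gt = map(data, make_tooltip)) |>  # list-column of HTML objects
  select(code, gt)
```

Because map() always returns a list, the gt column is a list-column, which is why it prints as a list of HTML objects.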

We then create a tibble called df_map that selects the most recent year for each country from the dataset and joins it to the map by the country code variable we created above. Finally we join this to the tibble of tables called gt_tables.
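The joins described above can be sketched as follows, with toy stand-ins for the dataset, the map and the tibble of tables. Whether the original uses inner or left joins is not stated, so the inner joins here are an assumption.

```r
library(dplyr)

# Toy stand-ins with placeholder values
df <- tibble(code  = c("ZAF", "ZAF", "SWE"),
             year  = c(2019, 2020, 2020),
             value = c(1, 2, 3))
world_map <- tibble(code = c("ZAF", "ZAF", "SWE"),
                    long = c(0, 1, 2),
                    lat  = c(0, 1, 0))
gt_tables <- tibble(code = c("ZAF", "SWE"),
                    gt   = list("<p>ZAF</p>", "<p>SWE</p>"))

# Most recent observation for each country
latest <- df |>
  group_by(code) |>
  slice_max(year, n = 1, with_ties = FALSE) |>
  ungroup()

# Attach the polygons, then the tooltip tables
df_map <- latest |>
  inner_join(world_map, by = "code") |>
  inner_join(gt_tables, by = "code")
```

The result has one row per map vertex, each carrying the latest value and the tooltip for its country.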

Creating the interactive figure

Now we are ready to create the interactive figure!

We begin by drawing a static map in grey, with data from the original map. Next we overlay the interactive choropleth. The grey static map will show through all the countries we don’t have data on in the dataset.
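The layering can be sketched with the ggiraph package, which is my assumption about the tooling since the post does not name it here. Two toy squares stand in for country polygons, the tooltip is a plain HTML string rather than a full table, and the legend title and caption text are placeholders.

```r
library(ggplot2)
library(ggiraph)

# Toy "countries": unit squares standing in for real polygons
square <- function(x0, code) {
  data.frame(long = x0 + c(0, 1, 1, 0), lat = c(0, 0, 1, 1),
             group = code, code = code)
}
world_map <- rbind(square(0, "AAA"), square(2, "BBB"))   # every country
df_map <- transform(square(0, "AAA"),                    # countries with data
                    value = 20.3,
                    tooltip = "<b>Country A</b><br>20.3% in 2020")

p <- ggplot() +
  # Static grey layer: shows through wherever there is no data
  geom_polygon(data = world_map, aes(long, lat, group = group),
               fill = "grey80") +
  # Interactive choropleth drawn on top
  geom_polygon_interactive(
    data = df_map,
    aes(long, lat, group = group, fill = value,
        tooltip = tooltip, data_id = code)
  ) +
  scale_fill_viridis_c(limits = c(3.5, 68.5)) +
  labs(fill = "Measure (%)", caption = "Source: placeholder") +
  coord_equal() +
  theme_void()

widget <- girafe(ggobj = p)
```

In this toy example country "BBB" has no data, so only its grey outline from the static layer is visible, while "AAA" is filled and shows its tooltip on hover. The labs() call covers the legend title and the source note listed among the components.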

Show off the interactive figure